iT邦幫忙

12th iThome Ironman Contest

DAY 28
Software Development

Service Development Miscellany series, part 28

Introduction to and Installation of Jaeger, a Distributed Tracing Service

Yesterday I introduced the terminology and concepts of OpenTelemetry.
Today we will set up Jaeger, one of the tracing systems that supports OpenTelemetry.

Jaeger

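Jaeger is a CNCF project, inspired by Dapper and OpenZipkin.
It is a distributed tracing system open-sourced by Uber, used to monitor and troubleshoot distributed systems.
Uber also published a post on its own blog, Evolving Distributed Tracing, describing how distributed tracing at Uber evolved from its beginnings to the birth of Jaeger.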

Architecture

Jaeger's service architecture consists of the following components.

Jaeger Client

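This is simply the Jaeger client library, which implements the OpenTracing API. Applications instrumented with OpenTelemetry can also report to Jaeger through a Jaeger exporter, as the example later in this post does.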

Jaeger Agent

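The agent follows the sidecar pattern: it receives the spans that clients send over UDP and pushes them to the Collector in batches.
Its main purpose is to hide the details of routing to and discovering collectors from the client.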

Jaeger Collector

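The Collector receives spans, then validates, transforms, and indexes them before writing them to the database.
The Collector can be configured with sampling logic, and spans are collected and processed according to those sampling settings.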

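Because this component is stateless, you can run many Collectors to speed up writes to the database.
Supported storage backends are Cassandra, Elasticsearch, and Kafka (Kafka acts as an intermediate buffer rather than a queryable store).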

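The official recommendation is Cassandra, for two reasons: Cassandra is a key-value database, so lookups by TraceID are very efficient, and its write throughput is quite good.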

For analytical queries, however, Elasticsearch is the more practical choice.

Jaeger Query

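Jaeger Query receives query requests, retrieves the traces from the database, and presents them through the UI.
Jaeger Query is also stateless, so you can run multiple instances.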

Installing with Docker Compose

We install Jaeger-Collector, Jaeger-Agent, and Jaeger-Query,
along with an Elasticsearch cluster (master + node).

version: "3.6"

services:
    jaeger-collector:
        image: jaegertracing/jaeger-collector
        command: 
            - --es.num-shards=2
            - --es.num-replicas=0
            - --es.server-urls=http://172.16.230.100:9200,http://172.16.230.101:9201
            - --collector.zipkin.host-port=:9411
        ports: 
            - "14269"
            - "14268:14268"
            - "14250"
            - "9411:9411"
        environment:
            - SPAN_STORAGE_TYPE=elasticsearch
            - LOG_LEVEL=debug
        networks:
            jaeger_net:
                ipv4_address: 172.16.230.2
        depends_on:
            - elasticsearch-master
    jaeger-query:
        image: jaegertracing/jaeger-query
        command: 
            - --es.num-shards=2
            - --es.num-replicas=0
            - --es.server-urls=http://172.16.230.100:9200,http://172.16.230.101:9201
        ports:
            - "16686:16686"
            - "16687"
        environment: 
            - SPAN_STORAGE_TYPE=elasticsearch
            - LOG_LEVEL=debug
        networks:
            jaeger_net:
                ipv4_address: 172.16.230.3
        depends_on:
            - elasticsearch-master
    jaeger-agent:
        image: jaegertracing/jaeger-agent
        command: 
            - --reporter.grpc.host-port=jaeger-collector:14250
            - --reporter.grpc.retry.max=1000
        ports:
            - "5775:5775/udp"
            - "6831:6831/udp"
            - "6832:6832/udp"
            - "5778:5778"
        environment: 
            - LOG_LEVEL=debug
        networks:
            jaeger_net:
                ipv4_address: 172.16.230.4
        depends_on: 
            - jaeger-collector
    elasticsearch-master:
        container_name: es-master01
        hostname: es-master01
        image: elasticsearch:7.1.1
        volumes:
            - ./elasticsearch/master/conf/es-master.yml:/usr/share/elasticsearch/config/elasticsearch.yml
            - ./elasticsearch/master/data:/usr/share/elasticsearch/data
            - ./elasticsearch/master/logs:/usr/share/elasticsearch/logs
        environment:
            - "ES_JAVA_OPTS=-Xms512m -Xmx512m"
        ports:
            - 9200:9200
            - 9300:9300
        expose: 
            - 9200
        networks:
            jaeger_net:
                ipv4_address: 172.16.230.100
    elasticsearch-slave1:
        container_name: es-slave01
        hostname: es-slave01
        image: elasticsearch:7.1.1
        volumes:
            - ./elasticsearch/slave1/conf/es-slave1.yml:/usr/share/elasticsearch/config/elasticsearch.yml
            - ./elasticsearch/slave1/data:/usr/share/elasticsearch/data
            - ./elasticsearch/slave1/logs:/usr/share/elasticsearch/logs
        environment:
            - "ES_JAVA_OPTS=-Xms512m -Xmx512m"
        ports:
            - 9100:9100
            - 9201:9201
        expose: 
            - 9201
        networks:
            jaeger_net:
                ipv4_address: 172.16.230.101
networks:
    jaeger_net:
        driver: bridge
        ipam:
            driver: default
            config:
            -
                subnet: 172.16.230.0/24

es-master.yml

cluster.name: es-cluster
node.name: es-master
node.master: true
node.data: true
network.host: 0.0.0.0
http.port: 9200
transport.port: 9300
discovery.seed_hosts:
  - 172.16.230.100
  - 172.16.230.101
cluster.initial_master_nodes:
  - es-master
http.cors.enabled: true
http.cors.allow-origin: "*"
xpack.security.enabled: false

es-slave1.yml

cluster.name: es-cluster
node.name: es-slave1
node.master: false
node.data: true
network.host: 0.0.0.0
http.port: 9201
discovery.seed_hosts:
  - 172.16.230.100
  - 172.16.230.101
cluster.initial_master_nodes:
  - es-master
http.cors.enabled: true
http.cors.allow-origin: "*"
xpack.security.enabled: false

For the command settings used above, see the CLI flags reference.
There actually are not many options to set.

Next, open a browser and go to http://172.16.230.3:16686/

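Let's try it out with the official example.
Here I send spans to the Agent, which then forwards them to the Collector.
You can of course send directly to the Collector as well;
which one to use depends on your architecture and throughput.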

package main

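import (
	"context"
	"log"

	// NOTE: these import paths come from a pre-1.0 OpenTelemetry Go release;
	// later releases moved or renamed the api/global and label packages.
	"go.opentelemetry.io/otel/api/global"
	"go.opentelemetry.io/otel/label"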

	"go.opentelemetry.io/otel/exporters/trace/jaeger"
	sdktrace "go.opentelemetry.io/otel/sdk/trace"
)

// initTracer creates a new trace provider instance and registers it as global trace provider.
func initTracer() func() {
	// Create and install Jaeger export pipeline
	flush, err := jaeger.InstallNewPipeline(
		jaeger.WithAgentEndpoint("172.16.230.4:6831"),
		// jaeger.WithCollectorEndpoint("http://localhost:14268/api/traces"),
		jaeger.WithProcess(jaeger.Process{
			ServiceName: "trace-demo",
			Tags: []label.KeyValue{
				label.String("exporter", "jaeger"),
				label.Float64("float", 312.23),
			},
		}),
		jaeger.WithSDK(&sdktrace.Config{DefaultSampler: sdktrace.AlwaysSample()}),
	)
	if err != nil {
		log.Fatal(err)
	}

	return func() {
		flush()
	}
}

func main() {
	fn := initTracer()
	defer fn()

	ctx := context.Background()

	tr := global.Tracer("component-main")
	ctx, span := tr.Start(ctx, "foo")
	bar(ctx)
	span.End()
}

func bar(ctx context.Context) {
	tr := global.Tracer("component-bar")
	_, span := tr.Start(ctx, "bar")
	defer span.End()

	// Do bar...
}

In the Jaeger UI, select trace-demo and click Find Traces.
The traces will be listed; click one to see its details.

foo is the parent span,
and bar is the child span.

That completes the basic environment setup.

We'll look at the rest tomorrow.

